Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier

Related articles

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and the Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...

Reversing and Smoothing the Multinomial Naive Bayes Text Classifier

Abstract. The naive Bayes text classifier has long been a core technique in information retrieval and, more recently, it has emerged as a focus of research itself in machine learning. This paper is concerned with the naive Bayes text classifier in its multinomial model instantiation. This model and an “equivalent” reversed version proposed here are interpreted under the statistical framework of...

Training a Naive Bayes Classifier via the EM Algorithm with a Class Distribution Constraint

Combining a naive Bayes classifier with the EM algorithm is one of the promising approaches for making use of unlabeled data for disambiguation tasks when using local context features including word sense disambiguation and spelling correction. However, the use of unlabeled data via the basic EM algorithm often causes disastrous performance degradation instead of improving classification perfor...

Regularization and Averaging of the Selective Naive Bayes classifier

The naïve Bayes classifier has proved to be very effective in many real data applications. Its performance usually benefits from an accurate estimation of univariate conditional probabilities and from variable selection. However, although variable selection is a desirable feature, it is prone to overfitting. In this paper, we introduce a new regularization technique to select the most probable subs...

Journal

Journal title: The KIPS Transactions:PartD

Year: 2006

ISSN: 1598-2866

DOI: 10.3745/kipstd.2006.13d.3.369